Automatic Evaluation of Tracheoesophageal Telephone Speech

Authors

  • Korbinian Riedhammer
  • Tino Haderlein
  • Maria Schuster
  • Frank Rosanowski
  • Elmar Nöth
Abstract

The tracheoesophageal (TE) substitute voice is currently the state-of-the-art treatment for restoring the ability to speak after laryngectomy. Intelligibility over the telephone is an important clinical factor because telephone use is a crucial part of patients' social life. An objective way to rate the intelligibility of substitute voices over the telephone is therefore desirable for improving post-laryngectomy speech therapy. An automatic speech recognition (ASR) system, trained on normal, non-pathologic speech, was applied to 41 high-quality recordings of post-laryngectomy patients. It yielded a word accuracy (WA) of 36.9% ± 18.0%; its correlation with the intelligibility ratings given by a group of human experts was -0.88. After the 41 recordings were downsampled to telephone quality, the ASR system reached a WA of 26.4% ± 13.9% and a correlation coefficient of -0.80. These results confirm that an ASR system can be used for objective intelligibility rating over the telephone.
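The two figures reported in the abstract, word accuracy and the correlation with expert ratings, can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: WA is taken in its usual form, 1 - (substitutions + deletions + insertions) / number of reference words, and the correlation is taken to be Pearson's r. The function names and all values are hypothetical and are not data from the study. A negative coefficient such as -0.88 would arise if lower rating scores denote better intelligibility, which is also an assumption here.

```python
from math import sqrt


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: WA = 1 - (S + D + I) / N, with N reference words.

    The error count S + D + I is the word-level edit distance.
    WA can be negative when the recognizer inserts many extra words.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return 1.0 - prev[-1] / len(ref)


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Hypothetical toy values: per-speaker WA and expert scores on a scale
    # where a lower score is assumed to mean better intelligibility.
    toy_wa = [0.55, 0.40, 0.31, 0.12]
    toy_scores = [1.5, 2.5, 3.0, 4.5]
    print(word_accuracy("the north wind and the sun", "the north wind in sun"))
    print(pearson(toy_wa, toy_scores))
```

Computing WA per speaker and correlating it with the averaged expert score per speaker is one plausible way to obtain a single coefficient for the whole group; the abstract does not specify how the expert ratings were aggregated.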


Similar resources

Application of automatic speech recognition to quantitative assessment of tracheoesophageal speech with different signal quality.

OBJECTIVE Tracheoesophageal voice is state-of-the-art in voice rehabilitation after laryngectomy. Intelligibility on a telephone is an important evaluation criterion as it is a crucial part of social life. An objective measure of intelligibility when talking on a telephone is desirable in the field of postlaryngectomy speech therapy and its evaluation. PATIENTS AND METHODS Based upon successf...


An Automatic Version of the Post-Laryngectomy Telephone Test

Tracheoesophageal (TE) speech is a possibility to restore the ability to speak after total laryngectomy, i.e. the removal of the larynx. The quality of the substitute voice has to be evaluated during therapy. For the intelligibility evaluation of German speakers over telephone, the Post-Laryngectomy Telephone Test (PLTT) was defined. Each patient reads out 20 of 400 different monosyllabic words...


Evaluation of Tracheoesophageal Substitute Voices Using Prosodic Features

Tracheoesophageal (TE) speech is a possibility to restore the ability to speak after laryngectomy, i.e. after the removal of the larynx. TE speech often shows low audibility and intelligibility which makes it a challenge for the patients to communicate. In speech rehabilitation the patient’s voice quality has to be evaluated. As no objective classification means exists until now and an automati...


A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...


Automatic Recognition and Evaluation of Tracheoesophageal Speech

Tracheoesophageal (TE) speech is a possibility to restore the ability to speak after laryngectomy, i.e. the removal of the larynx. TE speech often shows low audibility and intelligibility which also makes it a challenge to automatic speech recognition. We improved the recognition results by adapting a speech recognizer trained on normal, nonpathologic voices to single TE speakers by unsupervise...




Publication date: 2006